On Identifying the Optimal Number of Population Clusters via the Deviance Information Criterion
Abstract
Inferring population structure using Bayesian clustering programs often requires a priori specification of the number of subpopulations, K, from which the sample has been drawn. Here, we explore the utility of a common Bayesian model selection criterion, the Deviance Information Criterion (DIC), for estimating K. We evaluate the accuracy of DIC, as well as other popular approaches, on datasets generated by coalescent simulations under various demographic scenarios. We find that DIC outperforms competing methods in many genetic contexts, validating its application in assessing population structure.
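As a rough illustration of how DIC can be used to choose K, the sketch below applies the standard definition, DIC = D̄ + p_D with p_D = D̄ − D(θ̄), where D̄ is the posterior mean deviance and D(θ̄) is the deviance at the posterior mean of the parameters. The abstract does not state which DIC variant the authors use, so this definition is an assumption; the function names `dic` and `select_k` and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, p_D) under the standard definition (assumed here).

    deviance_samples: deviance values, one per retained MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean.
    """
    d_bar = mean(deviance_samples)             # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean   # effective number of parameters
    return d_bar + p_d, p_d


def select_k(runs):
    """Return the K with the smallest DIC, plus the DIC for every K.

    runs: dict mapping K -> (deviance_samples, deviance_at_posterior_mean).
    """
    scores = {k: dic(samples, d_hat)[0] for k, (samples, d_hat) in runs.items()}
    return min(scores, key=scores.get), scores


# Made-up deviance values for three candidate values of K (illustration only).
runs = {
    1: ([1210.0, 1190.0, 1200.0], 1195.0),
    2: ([1010.0, 990.0, 1000.0], 990.0),
    3: ([1005.0, 985.0, 995.0], 970.0),
}
best_k, scores = select_k(runs)
```

In this toy input, K = 3 has the lowest mean deviance but a larger complexity penalty p_D, so the smallest DIC is obtained at K = 2.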
Similar resources
Estimation of genetic parameters of litter size in Moghani sheep using threshold model via Bayesian approach
This study was conducted to estimate the genetic parameters of litter size (LS) in Moghani sheep using a threshold model via a Bayesian approach. The data originated from the Jafar-Abad Station of Ardabil province, Iran, and included 9698 lactation records of 4977 ewes with lambings from 1995 until 2010. The pedigree file consisted of data on animals born from 1987 to 2010. The significance of fixe...
Response to comment on 'On the inference of spatial structure from population genetics data'
The Deviance Information Criterion (DIC) was originally introduced in the context of generalized linear models, and DeIorio and Robert (2002) and Celeux et al. (2006) pointed out some potential inconsistencies in the definition of the DIC for other families of models. However, Francois et al. (2008) do not give any information about their definition of the DIC to estimate the number of clu...
How Many Clusters? An Information-Theoretic Perspective
Clustering provides a common means of identifying structure in complex data, and there is renewed interest in clustering as a tool for the analysis of large data sets in many fields. A natural question is how many clusters are appropriate for the description of a given system. Traditional approaches to this problem are based on either a framework in which clusters of a particular shape are assu...
Bayesian Model-Averaging in Unsupervised Learning From Microarray Data
Unsupervised identification of patterns in microarray data has been a productive approach to uncovering relationships between genes and the biological process in which they are involved. Traditional model-based clustering approaches as well as some recently developed model-based mining approaches for integrating genomic and functional genomic data rely on one’s ability to determine the correct ...
Optimal Sensors Location Using Modal Assurance Criterion in Modal Identification of Concrete Gravity Dams
Determining the optimal sensor locations for identifying modal parameters, especially in large structures such as dams, is a practical problem with wide application in damage detection and structural health monitoring. The main objective of this study is to obtain the most information from the dynamic response in a concrete gravity dam by minimizing the non-diagonal element...
Journal title:
Volume 6, issue
Pages -
Publication date: 2011